Malaria Detection in Blood Smeared Images Using Convolutional Neural Networks

Authors: Harsh Raj, Dr. S. Thanga Revathi, Shikhar Srivastava

DOI Link: https://doi.org/10.22214/ijraset.2023.52306

Abstract

With about 200 million cases annually, malaria often claims more lives than crises and wars. Inadequate diagnosis is one of the main obstacles to reducing malaria mortality, and efforts to lower death rates have so far had limited effect. As a result, malaria remains one of the major causes of death and disease in many developing countries, where young children and pregnant women are the most affected populations. Malaria is a potentially fatal disease caused by the Plasmodium parasite. Highly trained and experienced microscopists examine blood smear images to look for the parasite. Modern deep learning techniques could automate this analysis. An independent, accurate, and practical model could significantly reduce the demand for skilled staff. In this study, we present a fully automated approach for the diagnosis and classification of malaria from microscopic blood smear images, based on convolutional neural networks (CNNs).

I. INTRODUCTION

Malaria results when humans are bitten by female Anopheles mosquitoes carrying Plasmodium, a protozoan parasite that attacks and enlarges red blood cells. According to the WHO, more than 3.3 billion people around the world are at risk of contracting malaria each year. According to a World Health Organisation survey, 91 nations recorded more than 217 million cases of the disease. The African region has the highest concentration of malaria cases worldwide, followed by South-East Asia and then the Eastern Mediterranean region. Common symptoms include high fever, exhaustion, headaches, body aches and, in very severe cases, seizures and coma, all of which can be lethal if not treated quickly. Malaria is a deadly illness, but it can be prevented and treated effectively. Once contracted, it advances quickly. In many emerging and developing countries, malaria is a leading cause of mortality and places a major burden on healthcare systems. It is endemic in many regions of the world, meaning that people there contract the disease frequently. Early diagnosis and treatment are therefore crucial to saving lives. This motivates us to improve the efficacy and timeliness of malaria diagnostics.

II. BACKGROUND STUDY

Robert L. Clark (2021): Artemisinin-based combination treatments (ACTs) have recently been recommended for uncomplicated malaria in the first trimester of pregnancy. The WHO Guidelines for Malaria from 2021, however, reiterated the position that the available clinical safety information on artemisinins is not sufficient to warrant such use. The WHO's stance is consistent with a number of problems in the currently available clinical data. First, a meta-analysis that pooled safety information for several ACTs across the first trimester could not establish that all of the included ACTs were equally safe. Second, because safety data for all first-trimester periods were merged in the meta-analysis, it cannot show that every subperiod is equally safe, particularly gestational weeks 6 to 8, which are likely the most vulnerable period. Third, even if an individual ACT is shown to have no impact on miscarriage rates, that does not imply that all ACTs are without developmental consequences.
Amin Alqudah, Shoroq Qazan, Ali Mohammad Alqudah (2020): Malaria is an infectious disease caused by Plasmodium, a single-celled parasite. It is typically spread by infected female Anopheles mosquitoes. Recent statistics show that 229 million cases of malaria were reported worldwide in 2018 and that the illness was responsible for 435,000 deaths. More than 40% of the global population remains at risk. Scientists have developed a number of image analysis and machine learning methods that use blood smear images for early malaria detection. In this study, segmented infected and healthy red blood cells were classified using a newly designed CNN that takes advantage of transfer learning. The experimental results show that, among the CNN models previously employed, the proposed architecture detects malaria with an accuracy of 98.85%, a sensitivity of roughly 98.79%, and a precision of approximately 98.90%, while running at the highest speed and with the smallest input size.
Ali Mohammad Alqudah (2020): Optical coherence tomography (OCT) was initially employed for two-dimensional eye imaging and has since become one of the most important and widely used imaging methods for the noninvasive examination of retinal diseases. Age-related macular degeneration (AMD) and diabetic macular edema (DME) are among the most prevalent causes of blindness detected by OCT. With recent advances in machine learning and deep learning, classifying retinal diseases from OCT images has become a notable challenge. This research proposes a novel automated convolutional neural network (CNN) architecture for a multiclass classification system that uses spectral domain OCT (SD-OCT). The method classifies normal cases as well as five major forms of retinal illness, including AMD, drusen, choroidal neovascularization (CNV), and DME. The proposed CNN architecture with a softmax classifier correctly recognised 100% of AMD cases, 98.86% of CNV cases, 99.17% of DME cases, 98.97% of drusen cases, and 99.15% of normal cases, giving an overall precision of 95.30%. Applied to SD-OCT scans, this approach may prove to be a useful tool for the early diagnosis of retinal conditions.
Aimon Rahman, Hasib Zunair, M Sohel Rahman (2019): Malaria, a potentially fatal illness spread by the bites of female Anopheles mosquitoes, afflicts a large part of humanity. In this research, deep convolutional neural networks are applied to patches segmented from microscopic images of red blood cell smears to improve malaria diagnosis. Unlike many previous methods that require time-consuming manual feature extraction, the proposed method performs feature extraction and classification directly from the raw segmented patches of red blood smears. The dataset used was the malaria dataset from the National Institutes of Health. To compare designs and choose the best-performing one, accuracy and loss were used as evaluation criteria together with 5-fold cross-validation. Several more complex architectures were also implemented and tested to see which performs best. A test was also carried out to determine how well the proposed model generalises to unseen data. The accuracy of the best model is about 97.77%.
Jane Hung, Anne Carpenter (2017): Despite the enormous success of deep learning-based object detection models, the most advanced techniques have not yet seen widespread use on biological image data. For the first time, we apply an object detection model, previously used to recognise objects in natural images, to identify cells and their stages in brightfield microscopy images of malaria-infected blood. Many microorganisms, especially the parasites that cause malaria, are still examined by trained inspectors and counted by hand. This object detection task is challenging because of variations in cell shape, density, and colour, as well as the uncertainty of some cell classes. Furthermore, because healthy red blood cells predominate, the class distribution is highly imbalanced, which makes it hard to obtain annotated data that is useful for training. Our study used Faster R-CNN, one of the most effective object detection models of recent years. It was pre-trained on ImageNet and then fine-tuned on our data.

III. METHODOLOGY

Malaria cell images were collected from Kaggle. The Kaggle dataset originates from the official NIH website: https://ceb.nlm.nih.gov/repositories/malaria-datasets/.

The methods and steps for the project are as follows:

1. Data Collection and Pre-processing

Assemble a database of malaria cell images. Before further pre-processing, split the dataset into training and validation sets; then normalise the pixel values and resize the images to a fixed size (for example, 224x224).
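The pre-processing steps above can be sketched with NumPy alone. This is a minimal illustration, not the authors' pipeline: the function names, the nearest-neighbour resizing method, the 20% validation fraction, and the fixed random seed are all assumptions made for the example.

```python
import numpy as np


def resize_nearest(image, size=(224, 224)):
    """Resize an H x W x C array to `size` using nearest-neighbour sampling."""
    height, width = image.shape[:2]
    rows = np.arange(size[0]) * height // size[0]
    cols = np.arange(size[1]) * width // size[1]
    # Advanced indexing picks one source pixel per target pixel.
    return image[rows[:, None], cols]


def preprocess(images, size=(224, 224)):
    """Resize every image to a fixed size and scale pixel values to [0, 1]."""
    resized = np.stack([resize_nearest(img, size) for img in images])
    return resized.astype(np.float32) / 255.0


def train_val_split(x, y, val_fraction=0.2, seed=0):
    """Shuffle the samples and split them into training and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_val = int(len(x) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (x[train_idx], y[train_idx]), (x[val_idx], y[val_idx])
```

The cell images in the NIH dataset vary in size, which is why `preprocess` accepts a list of arrays rather than a single stacked array.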

2. Model Selection and Training

Choose CNN architectures such as VGG16, ResNet-50, and Inception for malaria classification. Initialize the pre-trained models with ImageNet weights. Train the models on the training set of malaria images for a fixed number of epochs with a chosen batch size.

3. Model Evaluation

Plot the training and validation loss curves and the accuracy curves for each model. Compare the three models using metrics such as F1-score, recall, accuracy, and precision.
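The comparison metrics named above can be computed directly from the true and predicted labels. The following is a minimal sketch in plain Python, assuming binary labels with 1 for parasitized and 0 for uninfected; the label encoding and the function name are illustrative choices, not taken from the original code.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1-score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Recall is the quantity to watch in a diagnostic setting, since a false negative means an infected cell is reported as healthy.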

4. Modified VGG Hybrid Model

Create a modified VGG hybrid version by combining the features of VGG16 with other CNN architectures. Train the hybrid model using the same dataset and hyperparameters as the other models.

5. Model Comparison

Compare the performance of the modified VGG hybrid model with that of the other models. Determine which model has the highest accuracy and choose it as the final model.

6. Refer to the Diagrams Below

IV. WORKING

The VGG16 hybrid deep learning model combines a custom CNN architecture with a pre-trained VGG16 network. It is designed for binary classification tasks and is trained on image data.

The first part of the code sets up the data generators for training and testing. The ImageDataGenerator class is used to perform data augmentation, which helps prevent overfitting and improves generalization performance. Using the flow_from_directory method, the train and test data are loaded from a directory.

The main architecture of the model is defined using the Keras functional API. The input layer accepts 224x224 RGB images. A series of convolutional layers with progressively more filters is then added, with max pooling layers in between. Batch normalization is used to accelerate convergence and reduce overfitting. The output of each later convolutional layer is then combined with the output of the first three layers in a separate pooling stage. The resulting feature maps are concatenated and flattened to form the feature vector. This vector is passed through a fully connected layer, and the output layer uses softmax activation to produce a probability distribution over the two classes. The pre-trained VGG16 network serves as a feature extractor for the input images. The output of the VGG16 network is concatenated with the output of the custom CNN, and this concatenated output is passed through the fully connected layer to perform classification.
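A rough sketch of such a two-branch model with the Keras functional API is given below. It is not the authors' code: the filter counts (32, 64, 128), the dense layer width, the dropout rate, the choice to freeze VGG16, and the function name `build_hybrid_model` are assumptions made for illustration. The sketch also simplifies the custom branch to a plain stack of convolution blocks and leaves out any VGG16-specific input pre-processing.

```python
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG16


def build_hybrid_model(input_shape=(224, 224, 3), vgg_weights="imagenet"):
    """Two-branch classifier: a small custom CNN plus VGG16 features."""
    inputs = keras.Input(shape=input_shape)

    # Branch 1: custom CNN with progressively more filters (assumed sizes).
    x = inputs
    for filters in (32, 64, 128):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D(2)(x)
    custom_features = layers.Flatten()(x)

    # Branch 2: VGG16 without its classifier head, used as a frozen extractor.
    vgg = VGG16(include_top=False, weights=vgg_weights, input_shape=input_shape)
    vgg.trainable = False
    vgg_features = layers.Flatten()(vgg(inputs, training=False))

    # Concatenate both feature vectors and classify into two classes.
    merged = layers.Concatenate()([custom_features, vgg_features])
    x = layers.Dense(128, activation="relu")(merged)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(2, activation="softmax")(x)
    return keras.Model(inputs, outputs, name="vgg16_hybrid_sketch")
```

Because the output layer has two softmax units, the labels fed to this model would need to be one-hot encoded. Passing `vgg_weights=None` builds the same graph without downloading the ImageNet weights, which is convenient for checking shapes.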

The model is compiled with the binary cross-entropy loss function and the stochastic gradient descent optimizer. It is trained for 100 epochs, with the ModelCheckpoint callback monitoring the model's training and testing accuracy. The trained model can be saved and applied to new data to make predictions. Overall, this architecture is designed for binary image classification and combines a custom CNN with a pre-trained VGG16 network. It is trained with stochastic gradient descent and uses batch normalization and data augmentation to improve performance.
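For reference, the standard binary cross-entropy loss over N training images and the plain gradient descent update can be written as follows. The symbols are notation introduced here rather than taken from the original code: y_i is the true label of image i, p_i is the predicted probability of the positive class, \theta denotes the model weights, and \eta is the learning rate.

```latex
\mathcal{L}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]

\theta \leftarrow \theta - \eta \, \nabla_{\theta} \mathcal{L}(\theta)
```

In stochastic gradient descent the gradient is estimated on a mini-batch of images at each step rather than on the full training set.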

V. RESULT AND ANALYSIS

For comparison with the other models, we used a hybrid model that combines a convolutional neural network (CNN) with the VGG16 architecture. The model first extracts features from the images using VGG16 and then classifies the images using the CNN. In the CNN design, convolutional and max-pooling layers are followed by fully connected layers. To avoid overfitting, the model also employs batch normalisation and dropout. The last dense layer assigns the images to one of two categories using the softmax activation function. The model is trained with a binary cross-entropy loss function, which is minimised by the stochastic gradient descent (SGD) optimizer. In our runs, this model was more stable and had lower training and validation loss than the other models. Its accuracy does not fluctuate significantly over epochs. The following graphs illustrate the comparison.

As the graphs below show, the accuracy curve fluctuates initially but becomes smooth as the model is trained over more epochs. Eventually the curve stabilises, with training accuracy approaching 100 percent and testing accuracy reaching 97.7 percent.

With Inception, the model does not progress as expected. The accuracy hovers around the 50 percent mark, with considerable instability.

With ResNet-50, the model reaches the 90% mark in accuracy but fluctuates heavily and is unstable. Hence, we conclude that the hybrid VGG16 model is the most effective of the models compared.

The saved model can also be loaded to pass an input image through it and predict whether the image shows malaria infection. For example, feeding random images from the dataset described above to the model yields the required malaria classification. Some of the outputs obtained were:

Image cells for malaria classification -
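To make the prediction step concrete: with a two-unit softmax output, the predicted class is the index with the highest probability. The sketch below assumes the class order Parasitized, Uninfected, which are the folder names of the NIH dataset in the alphanumeric order that flow_from_directory uses when assigning label indices. The function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(probabilities, class_names=("Parasitized", "Uninfected")):
    """Return the most likely class name and its probability."""
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[index], probabilities[index]
```

For a model output of `[0.9, 0.1]`, `predict_label` reports the cell as parasitized with probability 0.9. The class order should always be checked against the generator's `class_indices` before labels are interpreted.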

Conclusion

The analysis and findings indicate that the modified VGG16 model, which combines the layers of VGG16 and Inception, achieved the best performance among all the models assessed. With reduced fluctuation and a more consistent accuracy curve during training, the model was able to reach high accuracy. The testing accuracy was 97.7% and the training accuracy approached 100%, which was superior to all other models considered. Performance could be improved further by exploring more sophisticated transfer learning or fine-tuning of pre-trained models. Other approaches such as additional data augmentation, ensemble learning, and hyperparameter tuning may also be investigated to obtain even better results. Furthermore, the model can be deployed on various platforms for real-time diagnosis of malaria.

References

[1] WHO, World Malaria Report, Geneva World Heal. Organ, Geneva, Switzerland, 2017.
[2] WHO, “WHO guidelines for malaria,” p. 225, 2021, https://apps.who.int/iris/handle/10665/339609.
[3] R. Zemouri, N. Zerhouni, and D. Racoceanu, “Deep learning in the biomedical applications: recent and future status,” Applied Sciences, vol. 9, no. 8, p. 1526, 2019.
[4] I. Tobore, J. Li, L. Yuhang et al., “Deep learning intervention for health care challenges: some biomedical domain considerations,” JMIR mHealth and uHealth, vol. 7, no. 8, Article ID e11966, 2019.
[5] K. Suzuki, “Overview of deep learning in medical imaging,” Radiological Physics and Technology, vol. 10, no. 3, pp. 257–273, 2017.
[6] N. Tavakoli, M. Karimi, A. Norouzi, N. Karimi, S. Samavi, and S. M. R. Soroushmehr, “Detection of abnormalities in mammograms using deep features,” Journal of Ambient Intelligence and Humanized Computing, vol. 1-13, 2019.
[7] R. Dinesh Jackson Samuel and B. Rajesh Kanna, “Tuberculosis (TB) detection system using deep neural networks,” Neural Computing and Applications, vol. 31, no. 5, pp. 1533–1545, 2019.
[8] R. Sivaramakrishnan, S. Antani, Z. Xue, S. Candemir, S. Jaeger, and G. R. Thoma, “Visualizing Abnormalities in Chest Radiographs through Salient Network Activations in Deep Learning,” in Proceedings of the 2017 IEEE Life Sciences Conference (LSC), pp. 71–74, Sydney, NSW, Australia, December 2017.
[9] J. Hung and A. Carpenter, “Applying faster R-CNN for object detection on malaria images,” in Proceedings of the 2017 IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. Work, pp. 1–7, Honolulu, HI, USA, July 2017.
[10] Y. Dong, Z. Jiang, W. Shen David Pan et al., “Evaluations of Deep Convolutional Neural Networks for Automatic Identification of Malaria Infected Cells,” in Proceedings of the 2017 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI), pp. 101–104, Orlando, FL, USA, February 2017.
[11] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[12] M. D. Zeiler and R. Fergus, “Visualizing and Understanding Convolutional Networks,” in Proceedings of the Computer Vision - ECCV 2014, pp. 818–833, Zurich, Switzerland, September 2014.
[13] C. Szegedy, W. Liu, Y. Jia, P. Sermanet et al., “Going deeper with convolutions,” in Proceedings of the 2015 (CVPR), pp. 1–9, IEEE, Boston, MA, June 2015.
[14] Z. Liang, A. Powell, I. Ersoy et al., “CNN-based image analysis for malaria diagnosis,” in Proceedings of the 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 493–496, Shenzhen, China, December 2016.
[15] D. Bibin, M. S. Nair, and P. Punitha, “Malaria parasite detection from peripheral blood smear images using deep belief networks,” IEEE Access, vol. 5, pp. 9099–9108, 2017.
[16] F. Shaik, A. Kumar Sharma, S. Musthak Ahmed, V. Kumar Gunjan, and C. Naik, “An improved model for analysis of diabetic Retinopathy related imagery,” Indian Journal of Science and Technology, vol. 9, no. 44, pp. 1–6, 2016.
[17] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, pp. 1–14, San Diego, CA, USA, May 2015.
[18] V. K. Gunjan, P. S. Prasad, and S. Mukherjee, “Biometric template protection scheme-cancelable biometrics,” in ICCCE 2019 Lecture Notes in Electrical Engineering, A. Kumar and S. Mozar, Eds., vol. 570, pp. 405–411, Springer, Singapore, 2020.
[19] E. Rashid, M. D. Ansari, V. K. Gunjan, and M. Khan, “Enhancement in teaching quality methodology by predicting attendance using machine learning technique,” in Modern Approaches in Machine Learning and Cognitive Science: A Walkthrough. Studies in Computational Intelligence, V. Gunjan, J. Zurada, B. Raman, and G. Gangadharan, Eds., vol. 885, pp. 227–235, Springer, Cham, 2020.
[20] M. D. Ansari, V. K. Gunjan, and E. Rashid, “On security and data integrity framework for cloud computing using tamperproofing,” in ICCCE 2020 Lecture Notes in Electrical Engineering, A. Kumar and S. Mozar, Eds., vol. 698, pp. 1419–1427, Springer, Singapore, 2021.

Copyright

Copyright © 2023 Harsh Raj, Dr. S. Thanga Revathi, Shikhar Srivastava. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET52306

Publish Date : 2023-05-15

ISSN : 2321-9653

Publisher Name : IJRASET